A standardized general framework for encoding and exchange of corpus annotations: The Linguistic Annotation Framework, LAF

Author

  • Kerstin Eckart
Abstract

The Linguistic Annotation Framework (LAF) proposes a generic data model for the exchange of linguistic annotations and has recently become an ISO standard (ISO 24612:2012). This paper describes some aspects of LAF, its XML serialization GrAF, and some use cases related to the framework. While GrAF has already been used as an exchange format for corpora with several annotation layers, such as MASC and OANC, the generic LAF data model has also proved useful as the basis for designing data structures for a relational database management system.
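As a rough illustration of what a generic, graph-based stand-off data model can look like, the following minimal Python sketch keeps the primary text untouched and stores annotations separately, attached to nodes that point at character regions. All class and function names here (`Region`, `Node`, `Edge`, `Annotation`, `covered_text`) are hypothetical choices for this sketch; they are not taken from the ISO standard, from GrAF, or from the paper's database design.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Region:
    """A span of the primary data, given as character offsets (assumed anchoring)."""
    start: int
    end: int


@dataclass
class Node:
    """A graph node; it may point at zero or more regions of the primary data."""
    node_id: str
    regions: list = field(default_factory=list)


@dataclass
class Edge:
    """A directed edge between two nodes, e.g. phrase -> token."""
    source: str
    target: str


@dataclass
class Annotation:
    """A labelled feature set attached to a node by its id (stand-off)."""
    target_id: str
    label: str
    features: dict = field(default_factory=dict)


def covered_text(primary: str, node: Node) -> str:
    """Return the primary-data text covered by a node's regions."""
    return " ".join(primary[r.start:r.end] for r in node.regions)


# Hypothetical example data: the primary text is never modified.
primary = "Corpora need standards"
token = Node("n1", [Region(0, 7)])
phrase = Node("n2")                      # a non-terminal node without its own region
edges = [Edge("n2", "n1")]               # the phrase dominates the token
annotations = [
    Annotation("n1", "pos", {"tag": "NNS"}),
    Annotation("n2", "syntax", {"cat": "NP"}),
]

print(covered_text(primary, token))
```

Because nodes, edges, and annotations are plain records that refer to one another by identifier, a structure of this kind maps naturally onto tables in a relational database, which is the sort of use the abstract alludes to.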


Similar resources

Representing Linguistic Corpora and Their Annotations

A Linguistic Annotation Framework (LAF) is being developed within the International Standards Organization Technical Committee 37 Sub-committee on Language Resource Management (ISO TC37 SC4). LAF is intended to provide a standardized means to represent linguistic data and its annotations that is defined broadly enough to accommodate all types of linguistic annotations, and at the same time prov...

Towards International Standards for Language Resources

This paper describes the Linguistic Annotation Framework (LAF) developed by the International Standards Organization TC37 SC4, which is to serve as a basis for harmonizing existing language resources as well as developing new ones. We then describe the use of the LAF to represent the American National Corpus and its linguistic annotations.

LAF-Fabric: a data analysis tool for Linguistic Annotation Framework with an application to the Hebrew Bible

The Linguistic Annotation Framework (LAF) provides a general, extensible stand-off markup system for corpora. This paper discusses LAF-Fabric, a new tool to analyse LAF resources in general with an extension to process the Hebrew Bible in particular. We first walk through the history of the Hebrew Bible as text database in decennium-wide steps. Then we describe how LAF-Fabric may serve as an an...

A LAF/GrAF based Encoding Scheme for underspecified Representations of syntactic Annotations

Data models and encoding formats for syntactically annotated text corpora need to deal with syntactic ambiguity; underspecified representations are particularly well suited for the representation of ambiguous data because they allow for high informational efficiency. We discuss the issue of being informationally efficient, and the trade-off between efficient encoding of linguistic annotations a...

The Linguistic Annotation Framework: a standard for annotation interchange and merging

This paper overviews the International Standards Organization Linguistic Annotation Framework (ISO LAF) developed in ISO TC37 SC4. We describe the XML serialization of ISO LAF, the Graph Annotation Format (GrAF) and discuss the rationale behind the various decisions that were made in determining the standard. We describe the structure of the GrAF headers in detail and provide multiple examples ...


Publication date: 2012